Variable Selection for Meaningful Clustering of Multitopic Territorial Data
Abstract
This paper proposes a new methodology to improve territorial cohesion in clustering processes where many variables from different topics are considered. Clustering techniques provide added value to identify typologies, but there are still unsolved challenges when data contain an unbalanced number of variables from different topics. The territorial feature selection method (TFSM) is presented as a new methodology to select the most representative variable of each topic such that the interpretability of the resulting clusters is preserved and the geographical cohesion is improved with respect to classical approaches. This paper also introduces the thermometer as a new knowledge acquisition tool that allows experts to transfer semantics to the data mining process. TFSM is based on the index of potential explainability (Ek) as a criterion to select the most promising variables for clustering. Ek is based on the combination of inferential testing and metrics of support. The proposal is applied to the INSESS-COVID19 database, where groups of vulnerable populations were found. A set of 195 variables from 21 thematic blocks is used to compare the results with traditional clustering and multiview clustering analysis, both from the point of view of cohesion and of the capacity to support further decision making.
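The idea of keeping one representative variable per topic can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not define Ek, so the score used here (a one-way ANOVA F statistic against a reference grouping) is only a hypothetical stand-in for an inferential-testing criterion, and the function names, variable names and toy data are all invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def f_statistic(values, labels):
    """One-way ANOVA F statistic of `values` across the groups in `labels`.

    Assumption: used here as a stand-in score, NOT the paper's Ek index.
    """
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)
    n, k = len(values), len(groups)
    if k < 2 or n <= k:
        return 0.0
    grand = mean(values)
    between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return (between / (k - 1)) / (within / (n - k))


def select_one_per_topic(data, topics, labels, score=f_statistic):
    """Return {topic: variable}, keeping the best-scoring variable of each topic."""
    best = {}
    for name, values in data.items():
        topic = topics[name]
        s = score(values, labels)
        if topic not in best or s > best[topic][1]:
            best[topic] = (name, s)
    return {topic: name for topic, (name, _) in best.items()}


# Toy data (hypothetical): four variables, two topics, six territories.
labels = ["a", "a", "a", "b", "b", "b"]  # reference grouping of the territories
data = {
    "income": [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],
    "rent": [2.0, 2.5, 1.5, 2.1, 2.4, 1.6],
    "age": [30, 50, 40, 31, 49, 41],
    "household_size": [1, 2, 1, 4, 5, 4],
}
topics = {
    "income": "economy",
    "rent": "economy",
    "age": "demography",
    "household_size": "demography",
}

selected = select_one_per_topic(data, topics, labels)
print(selected)
```

Passing the score as a parameter keeps the per-topic selection separate from the criterion, so a different index could be swapped in without changing the selection loop. Each topic contributes exactly one variable regardless of how many variables it originally had, which is how the sketch addresses the imbalance between topics.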
Similar resources
Variable Selection for Model-Based Clustering
We consider the problem of variable or feature selection for model-based clustering. We recast the problem of comparing two nested subsets of variables as a model comparison problem, and address it using approximate Bayes factors. We develop a greedy search algorithm for finding a local optimum in model space. The resulting method selects variables (or features), the number of clusters, and the...
Variable selection in model-based clustering using multilocus genotype data
We propose a variable selection procedure in model-based clustering using multilocus genotype data. Indeed, it may happen that some loci are not relevant for clustering into statistically different populations. Inferring the number K of clusters and the relevant clustering subset S of loci is regarded as a model selection problem. The competing models are compared using penalized maximum likelihood c...
Bayesian Variable Selection in Clustering High-Dimensional Data With Substructure
In this article we focus on clustering techniques recently proposed for high-dimensional data that incorporate variable selection and extend them to the modeling of data with a known substructure, such as the structure imposed by an experimental design. Our method essentially approximates the within-group covariance by facilitating clustering without disrupting the groups defined by the experime...
Bayesian Variable Selection in Clustering High-Dimensional Data
Over the last decade, technological advances have generated an explosion of data with substantially smaller sample size relative to the number of covariates (p ≫ n). A common goal in the analysis of such data involves uncovering the group structure of the observations and identifying the discriminating variables. In this article we propose a methodology for addressing these problems simultaneousl...
Optimal Feature Selection for Data Classification and Clustering: Techniques and Guidelines
In this paper, principles and existing feature selection methods for classifying and clustering data are introduced. To that end, categorizing frameworks for finding selected subsets, namely, search-based and non-search-based procedures, as well as evaluation criteria and data mining tasks are discussed. In the following, a platform is developed as an intermediate step toward developing an intell...
Journal
Journal title: Mathematics
Year: 2023
ISSN: 2227-7390
DOI: https://doi.org/10.3390/math11132863